Abstract: An antidictionary code is a lossless compression
algorithm using an antidictionary, which is a set of minimal
words that do not occur as substrings in an input string.
The code was proposed by Crochemore et al. in 2000, and its
asymptotic optimality has been proved only with respect to a
specific information source, called a balanced binary source, that is
a binary Markov source in which a state transition occurs with
probability 1/2 or 1. In this paper, we prove the optimality of
both static and dynamic antidictionary codes with respect to a
stationary ergodic Markov source over a finite alphabet such that a
state transition occurs with probability $p$ ($0 < p \le 1$).

I. Introduction 

This paper proves two theorems on the asymptotic
optimality of both static and dynamic antidictionary codes
for stationary ergodic Markov information sources. An
antidictionary for a given string is a set of words of minimal
length that never appear in the string, and it is particularly
useful for data compression. An antidictionary coding scheme,
called Data Compression using Antidictionaries (DCA), was
first proposed by Crochemore et al. [1] for binary strings.
Some extensions of the DCA, which can handle a
finite alphabet and are applied to arithmetic codes, have been
proposed [2]-[4] (cf. [5]). Those algorithms work in an
off-line manner, while some on-line DCA algorithms that use
dynamic suffix trees and work in linear time and space have
been proposed [6]-[8]. Moreover, a memory-efficient DCA
using suffix arrays was proposed [9]. Simulation results showed that
the algorithm of [8] achieves compression ratios comparable to those of an
efficient off-line data compression algorithm using the
Burrows-Wheeler transformation [10].

On the other hand, asymptotic optimality of a static DCA
algorithm has been proved only for balanced binary sources [1].
It was shown that the algorithm is asymptotically optimal for
the source generated by an antidictionary if and only if the
antidictionary is given to the algorithm in advance [1]. The
average code length per symbol converges to the entropy
rate of the source with probability one. The balanced binary
source is a Markov source of finite order and emits all the
strings that do not contain any word of the antidictionary as
a substring. Moreover, for any state of the Markov source
with only one outgoing edge, probability one is assigned to
that edge, while for any state with two outgoing edges, probability
1/2 is assigned to each of those edges.

In this paper, we prove asymptotic optimality of a static
and a dynamic DCA for a Markov source constructed from
an antidictionary over a finite alphabet such that a state transition
occurs with probability $p$ ($0 < p \le 1$). This paper is organized
as follows. Section II gives the basic definitions and notations.
Section III reviews the DCA algorithms. Section IV
proves two theorems on the asymptotic optimality
of a static and a dynamic antidictionary code, respectively.
Section V summarizes our results.

II. Basic Definitions and Notations 

Let $\mathcal{X} = \{0, 1, \ldots, J-1\}$ be a finite alphabet and $\mathcal{X}^*$ be
the set of all finite strings over $\mathcal{X}$, including the null string of
length zero, denoted by $\lambda$. For $\mathcal{X}$ and $x \in \mathcal{X}^*$, $|\mathcal{X}|$ and $|x|$
represent the size of $\mathcal{X}$ and the length of $x$, respectively. For a
string $x = x_1 x_2 \cdots x_n \in \mathcal{X}^n$ of length $n$, let $\Sigma(x)$ be the set
of all suffixes of $x$, that is, $\Sigma(x) = \{x_i x_{i+1} \ldots x_n \mid 1 \le i \le n\} \cup \{\lambda\}$,
and let $\mathcal{D}(x)$ be the dictionary of all substrings of
$x$, that is, $\mathcal{D}(x) = \{x_i x_{i+1} \cdots x_j \mid 1 \le i \le j \le n\} \cup \{\lambda\}$. Let
$x^i$ be the prefix of length $i$ of $x$, and we define $x^0 = \lambda$.
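
As an illustration of these definitions, the following Python sketch computes the suffix set, the substring dictionary, and, by brute force, the minimal words absent from a string (the antidictionary discussed in the Introduction). The function names are hypothetical and chosen only for this example; the enumeration is a naive one, not the construction used in [1].

```python
def substrings(x):
    """D(x): every substring of x, including the empty string."""
    return {x[i:j] for i in range(len(x) + 1) for j in range(i, len(x) + 1)}


def suffixes(x):
    """Sigma(x): every suffix of x, including the empty string."""
    return {x[i:] for i in range(len(x) + 1)}


def minimal_forbidden_words(x, alphabet):
    """Words absent from x whose longest proper prefix and longest
    proper suffix both occur in x (naive enumeration)."""
    d = substrings(x)
    result = set()
    for u in d:                      # u is the longest proper prefix
        for a in alphabet:
            w = u + a
            if w not in d and w[1:] in d:
                result.add(w)
    return result


if __name__ == "__main__":
    print(sorted(minimal_forbidden_words("0110", "01")))
```

For the string 0110 over {0, 1}, the sketch reports 00, 010, 101, and 111: each is absent from 0110, while removing its first or last symbol gives a substring of 0110.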

A. Markov Source 

Let $\mathcal{A} \subset \mathcal{X}^* \setminus \{\lambda\}$ be a non-empty finite set, and we assume
that no word $u \in \mathcal{A}$ is a substring of any $v \in \mathcal{A}$ such that
$v \ne u$. Crochemore et al. showed a deterministic automaton
$F(\mathcal{A})$ which accepts all strings that contain no strings of $\mathcal{A}$ as
their substrings [11]. In [1], $F(\mathcal{A})$ is used as an encoder and a
decoder of the static DCA algorithm. The set $\mathcal{A}$ will be referred to
as the antidictionary and a string in $\mathcal{A}$ will be referred to as a
Minimal Forbidden Word (MFW). A deterministic automaton
$F(\mathcal{A}) = (U, \mathcal{X}, s_I, \mathcal{A})$ is defined as follows: Let $s(w)$ be the
state corresponding to string $w$ in $F(\mathcal{A})$. In other words, $s(w)$
is the state reached by string $w$ from the initial state $s_I$.

• The initial state $s_I$ is $s(\lambda)$.

• A state $s(v)$ for $v \in \mathcal{A}$ is called a sink state. Any sink
state has $|\mathcal{X}|$ outgoing edges, all having distinct labels,
and all the edges of the state terminate at the state itself.

• $U = \{u \mid u \text{ is a proper prefix of } v \in \mathcal{A}\}$. Note that
a proper prefix of $v = v_1 v_2 \ldots v_i$ is any of the strings
$v_1 v_2 \cdots v_j$ for $1 \le j < i$, or $\lambda$. A state $s(u)$ has $|\mathcal{X}|$
outgoing edges, all having distinct labels. These edges
are defined in the following manner: for each $a \in \mathcal{X}$,

(i) if $ua \in U$, then the edge labeled $a$ from $s(u)$
terminates at $s(ua)$.

(ii) if $ua \notin U$, then the edge labeled $a$ from $s(u)$
terminates at $s(w)$, where $w$ is the longest suffix
of $ua$ such that $w \in (U \cup \mathcal{A})$.
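
The construction above can be sketched in Python as follows. This is a minimal illustration under our own assumptions: states are named by their strings, the automaton is stored as a dictionary of edges, and the helper names (`build_automaton`, `restrict_to_g`, `run`) are hypothetical. Removing the sink states and the edges entering them yields the reduced automaton used in the rest of the paper.

```python
def build_automaton(antidictionary, alphabet):
    """Edges of F(A): delta[(state, symbol)] -> state, plus the sink set."""
    mfws = set(antidictionary)
    prefixes = {v[:j] for v in mfws for j in range(len(v))}   # the set U
    known = prefixes | mfws
    delta = {}
    for u in prefixes:
        for a in alphabet:
            ua = u + a
            # Rules (i) and (ii): the longest suffix of ua lying in U or A.
            # The empty string is always in U, so a match always exists.
            delta[(u, a)] = next(
                ua[k:] for k in range(len(ua) + 1) if ua[k:] in known
            )
    for v in mfws:                    # every edge of a sink returns to it
        for a in alphabet:
            delta[(v, a)] = v
    return delta, mfws


def restrict_to_g(delta, sinks):
    """Drop sink states and all edges entering them."""
    return {
        (s, a): t
        for (s, a), t in delta.items()
        if s not in sinks and t not in sinks
    }


def run(edges, state, w):
    """Follow w from state; None means w leads to a forbidden word."""
    for a in w:
        state = edges.get((state, a))
        if state is None:
            return None
    return state


if __name__ == "__main__":
    delta, sinks = build_automaton({"11", "000", "10101"}, "01")
    g = restrict_to_g(delta, sinks)
    print(run(g, "", "0100"), run(g, "10", "0100"))
```

For the antidictionary {11, 000, 10101}, reading 0100 from the state of the empty string and from the state of 10 ends in the same state (that of 00), while reading it from the state of 101 is impossible because 10101 would occur.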

Let $G(\mathcal{A})$ be the automaton obtained by deleting from $F(\mathcal{A})$
all sink states and all edges entering sink states. Fig. 1
shows $G(\mathcal{A})$ and $F(\mathcal{A})$, where $\mathcal{A} = \{11, 000, 10101\}$ and
$\mathcal{X} = \{0, 1\}$. In Fig. 1, the solid lines and circles represent
$G(\mathcal{A})$, while $G(\mathcal{A})$ together with the dotted lines and squares represents
$F(\mathcal{A})$, where squares represent sink states. To avoid trivial
cases, we suppose that any state of $G(\mathcal{A})$ has at least one
outgoing edge. For a state $s$ of $G(\mathcal{A})$, let $\mathcal{L}(s)$ be the set of
labels of all outgoing edges from $s$.




Fig. 1. The automata $G(\mathcal{A})$ and $F(\mathcal{A})$ for $\mathcal{A} = \{11, 000, 10101\}$.



Let $S$ be the set of all states of $G(\mathcal{A})$, and let $S_1$ and
$S_2$ be the set of all states having only one outgoing edge
and that of all states having at least two outgoing edges,
respectively. For $G(\mathcal{A})$, let $T : S \times \mathcal{X} \to S$ be the
time-independent transition probabilities, called the transition probability
matrix. A stationary Markov (or unifilar, cf. [12]) source $\boldsymbol{X}_{\mathcal{A}}$
is characterized by $T$ of $G(\mathcal{A})$, and let $(\mu_1, \mu_2, \ldots, \mu_{|S|})$ be
the stationary distribution whose components are the stationary
probabilities of the states. We call $\boldsymbol{X}_{\mathcal{A}}$ an antidictionary source
in this paper.

Moreover, $\boldsymbol{X}_{\mathcal{A}}$ is a source called a shift of finite type [13]
since $\boldsymbol{X}_{\mathcal{A}}$ is described by a finite set of forbidden strings.
Hence, $\boldsymbol{X}_{\mathcal{A}}$ is a stationary ergodic source [13]. A sequence
$X^n = X_1 X_2 \cdots X_n$ represents the sequence of random
variables of length $n$ on $\boldsymbol{X}_{\mathcal{A}} = \{X_j : j = 1, 2, \ldots\}$. For a state
$s_i$ of $G(\mathcal{A})$ in $\boldsymbol{X}_{\mathcal{A}}$, $p_{ic}$ represents the transition probability of
the outgoing edge from $s_i$ with label $c$. The entropy $H(\boldsymbol{X}_{\mathcal{A}})$
is given by

$$H(\boldsymbol{X}_{\mathcal{A}}) = -\sum_{i : s_i \in S_2} \mu_i \sum_{c=0}^{|\mathcal{X}|-1} p_{ic} \log_2 p_{ic}, \tag{1}$$

where $0 \log_2 0 = 0$. In particular, if $\boldsymbol{X}_{\mathcal{A}}$ satisfies $|\mathcal{X}| = 2$,
$p_{j0} = p_{j1} = 1/2$ for any $s_j \in S_2$ and $p_{k0} = 1$ or $p_{k1} = 1$ for
any $s_k \in S_1$, then $\boldsymbol{X}_{\mathcal{A}}$ is called a binary balanced source.
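
To make the entropy formula (1) concrete, the sketch below evaluates it numerically for a small source. It is only an illustration: the dictionary layout `trans[state][symbol] = (next_state, probability)` and the function names are our own assumptions, and the stationary distribution is approximated by power iteration, which presumes an aperiodic chain.

```python
from math import log2


def stationary_distribution(trans, iterations=10_000):
    """Approximate the stationary probabilities by power iteration."""
    states = list(trans)
    mu = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        nxt = {s: 0.0 for s in states}
        for s in states:
            for target, p in trans[s].values():
                nxt[target] += mu[s] * p
        mu = nxt
    return mu


def entropy_rate(trans):
    """Entropy of the source per formula (1); edges with probability one
    contribute nothing, so only states with several edges matter."""
    mu = stationary_distribution(trans)
    return -sum(
        mu[s] * p * log2(p)
        for s in trans
        for _, p in trans[s].values()
        if p > 0
    )


if __name__ == "__main__":
    # Binary balanced source for the antidictionary {11}:
    # the state "" has two edges, the state "1" has one.
    balanced = {
        "": {"0": ("", 0.5), "1": ("1", 0.5)},
        "1": {"0": ("", 1.0)},
    }
    print(entropy_rate(balanced))
```

For this balanced source the stationary probabilities are 2/3 and 1/3, so the entropy is 2/3 bit per symbol.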

The automaton $G(\mathcal{A})$ has a useful property, called the
synchronization property [1]. For a state $s_i$, let $l(s_i)$ be the locus string
$u$ such that $s_i = s(u)$ and $u \in U$ are satisfied. Notice that
$s(l(s_i)) = s_i$.

Let $u$ and $v$ be the strings $l(s_i)$ and $l(s_j)$ for states $s_i$ and
$s_j$ ($i \ne j$), respectively, and let $m$ be the length of the longest
MFW in $\mathcal{A}$. Then, we have the following theorem.

Theorem A (Theorem 3 [1]): For any string $w \in \mathcal{X}^*$ of
length $m - 1$, if neither of the strings $uw$ and $vw$ contains any
string of $\mathcal{A}$ as a substring, then $s(uw) = s(vw)$.

In other words, suppose that $s_d$ and $s_e$ are the states reached
by $w$ from $s_i$ and $s_j$, respectively. Then $s_d = s_e$ if the
conditions shown in Theorem A are satisfied. In Fig. 1, $m - 1$
is given by 4 since the length of the longest MFW, that is 10101,
is 5. As an example, for $s_1$, $s_5$ and $w = 0100$, the states
reached by $w$ from $s_1$ and $s_5$ are the same state $s_3$.

B. Suffix Tree 

The suffix tree of $x$ is a tree structure [14] that stores all
elements of $\Sigma(x)$. Let $T_i$ be the suffix tree of $x^i$. The string
associated with the path from the root $\rho$ to a node $p$ in $T_i$ is
denoted by $w(p)$, and we define $w(\rho)$ to be $\lambda$. The string
length $|w(p)|$ will be referred to as the depth of $p$. For any node
$p$ in $T_i$, let $\mathcal{L}_i(p)$ be the set of labels of all edges
sprouting from $p$, that is, $\mathcal{L}_i(p) = \{a \mid w(p)a \in \mathcal{D}(x^i), a \in \mathcal{X}\}$.
For any node $p \ne \rho$, we can write $w(p) = av$, where
$a \in \mathcal{X}$ and $v \in \mathcal{X}^*$. Let $q$ be the node such that $w(q) = v$;
a pointer from $p$ to $q$, denoted by $\sigma(p)$, is called a suffix
link. For a given depth $d \ge 0$, if $|w(p)| > d$, then let $\sigma_d(p)$
be the node of depth $d$ reached by following the series of suffix links
starting from $p$ and moving back to the root $\rho$.

Definition 1 (active point): An active point $\alpha_i$ in $T_i$ is the
node corresponding to the string $u$ such that $u$ is the longest string
in $(\Sigma(x^i) \cap \mathcal{D}(x^{i-1}))$, where $\alpha_0$ is the root $\rho$.

The active point plays a key role in the on-line algorithm,
called the Ukkonen algorithm, for constructing suffix trees
with linear complexity [15].

III. Review of the DCA algorithms 

First, we describe a static DCA algorithm [4]. We suppose
that Assumption 1 is satisfied for the static DCA algorithm.

Assumption 1: The static DCA algorithm knows $\mathcal{A}$.

From Assumption 1, notice that $G(\mathcal{A})$ serves as the encoder and
decoder of the algorithm since $G(\mathcal{A})$ is constructed from
$\mathcal{A}$. Table I shows the output for $x_{i+1}$ in the static DCA algorithm.

TABLE I
Output for $x_{i+1}$ in the static DCA algorithm.

Case    $|\mathcal{L}(s(x^i))|$    Output
(1)     1                          none
(2)     at least 2                 $e(\Pr(x_{i+1} \mid s(x^i)))$

In Case-(1), no symbol is output, that is, $x_{i+1}$ is predictable
since there exists only one outgoing edge from $s(x^i)$. In
Case-(2), $e(\cdot)$ represents an adaptive arithmetic coder of order 0
(cf. [16]). The probability $\Pr(x_{i+1} \mid s(x^i))$ is calculated by
$N(x_{i+1} \mid s(x^i)) / \sum_{c \in \mathcal{X}} N(c \mid s(x^i))$, where $N(c \mid s(x^i))$ is a
counter that holds the number of times the edge labeled $c$ has been traversed from $s(x^i)$.
Note that for $s_k$, if $c \in \mathcal{L}(s_k)$, then the initial
value of $N(c \mid s_k)$ is set to 1. Otherwise its initial value is 0.
For a given input string $x$ of length $n$, the codeword of the
static DCA algorithm is given by the triplet

$$(\mathcal{A}, e(x), n). \tag{2}$$
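
The behavior summarized in Table I can be sketched as follows. This is a simplified illustration, not the coder of [4]: it builds the reduced automaton from a known antidictionary and accumulates the ideal code length $-\log_2 \Pr(\cdot)$ in place of running an actual arithmetic coder, which is an assumption made only to keep the example short. All function names are hypothetical.

```python
from math import log2


def build_g(antidictionary, alphabet):
    """Reduced automaton as g[state][symbol] -> next state (no sinks)."""
    mfws = set(antidictionary)
    states = {v[:j] for v in mfws for j in range(len(v))}
    known = states | mfws
    g = {s: {} for s in states}
    for u in states:
        for a in alphabet:
            ua = u + a
            target = next(ua[k:] for k in range(len(ua) + 1) if ua[k:] in known)
            if target not in mfws:        # edges into sinks are removed
                g[u][a] = target
    return g


def static_dca_code_length(x, g):
    """Ideal length in bits of e(x) under the adaptive counters N(c|s)."""
    # A counter starts at 1 for every existing edge; absent edges have none.
    counts = {s: {c: 1 for c in edges} for s, edges in g.items()}
    state, bits = "", 0.0
    for symbol in x:
        edges = g[state]
        if symbol not in edges:
            raise ValueError("input contains a forbidden word")
        if len(edges) >= 2:               # Case-(2): encode the symbol
            total = sum(counts[state].values())
            bits -= log2(counts[state][symbol] / total)
        # Case-(1) emits nothing: the symbol is predictable.
        counts[state][symbol] += 1
        state = edges[symbol]
    return bits


if __name__ == "__main__":
    g = build_g({"11"}, "01")
    print(static_dca_code_length("0100", g))
```

For the antidictionary {11} and the input 0100, the third symbol follows a 1 and is therefore predictable, so only three symbols are charged, with probabilities 1/2, 1/3, and 2/4.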

Next, we describe a dynamic DCA algorithm [8]. The
algorithm uses a subtree of the dynamic suffix tree, which
has a given fixed depth $d+1$ ($d \ge 0$). In [8], a node $\beta_i$ in $T_i$,
called the modified active point, is used to encode symbol $x_{i+1}$.
The node $\beta_i$ is defined as follows:

Definition 2 (modified active point): For a given fixed
integer $d \ge 0$,

$$\beta_i = \begin{cases} \alpha_i & (|w(\alpha_i)| \le d), \\ \sigma_d(\alpha_i) & (|w(\alpha_i)| > d). \end{cases} \tag{3}$$
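
A naive way to see what Definitions 1 and 2 select is to work directly on strings instead of on a suffix tree. The sketch below is only an illustration with hypothetical function names: it returns the strings $w(\alpha_i)$ and $w(\beta_i)$ by direct search, whereas the algorithm of [8] maintains the corresponding nodes incrementally. It relies on the fact that following suffix links removes leading symbols, so the node of depth $d$ reached from the active point spells the last $d$ symbols of its string.

```python
def active_point_string(x, i):
    """w(alpha_i): the longest suffix of x^i that occurs in x^(i-1)."""
    prefix = x[:i]
    shorter = x[:i - 1] if i > 0 else ""
    for k in range(len(prefix) + 1):
        if prefix[k:] in shorter:     # the empty suffix always matches
            return prefix[k:]


def modified_active_point_string(x, i, d):
    """w(beta_i): w(alpha_i) truncated to its last d symbols if longer."""
    w = active_point_string(x, i)
    return w if len(w) <= d else w[len(w) - d:]


if __name__ == "__main__":
    print(active_point_string("010101", 6))
    print(modified_active_point_string("010101", 6, 2))
```

For $x = 010101$ and $i = 6$, the longest suffix of $x^6$ occurring in $x^5 = 01010$ is 0101, and with $d = 2$ the modified active point spells 01.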

Table II shows the output for $x_{i+1}$ in the dynamic DCA
algorithm. In Case-(0), the pair $(I, R(x_{i+1}))$ is output, where
$I$ represents an interval of insertion of a new edge, and $R(x_{i+1})$
represents the rank of $x_{i+1}$ ($1 \le R(x_{i+1}) \le |\mathcal{X}|$). Let $\mathcal{L}_i(\beta_i)$
be the set $\{a \mid w(\beta_i)a \in \mathcal{D}(x^i), a \in \mathcal{X}\}$. Let $\mathcal{R}_i$ be the set of the
longest strings $w(p)c$ in $(\Sigma(w(\beta_i)c) \cap \mathcal{D}(x^i))$ or $\{c\}$ for each
$c \in (\mathcal{X} \setminus \mathcal{L}_i(\beta_i))$. Suppose that $w(p)a, w(q)b \in \mathcal{R}_i$, $a \ne b$.


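TABLE II
Output for $x_{i+1}$ in the dynamic DCA algorithm.

Case    Relationship between $\beta_i$ and $x_{i+1}$                                  Output
(0)     $x_{i+1} \notin \mathcal{L}_i(\beta_i)$                                       $(I, R(x_{i+1}))$
(1)     $|\mathcal{L}_i(\beta_i)| = 1$ and $x_{i+1} \in \mathcal{L}_i(\beta_i)$       none
(2)     $|\mathcal{L}_i(\beta_i)| \ge 2$ and $x_{i+1} \in \mathcal{L}_i(\beta_i)$     $e(\Pr(x_{i+1} \mid \beta_i))$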

If one of the following conditions (4), (5) and (6) is satisfied, then
$R(a) < R(b)$.

$$|w(p)a| > |w(q)b|, \tag{4}$$

$$|w(p)a| = |w(q)b| \text{ and } N(a \mid p) > N(b \mid q), \tag{5}$$

$$|w(p)a| = |w(q)b|, \; N(a \mid p) = N(b \mid q) \text{ and } a < b \text{ (in lexicographical order)}, \tag{6}$$

where $N(\cdot \mid \cdot)$ is the counter used in Case-(2). The rank $R(x_{i+1})$
is determined by traversing up suffix links starting from $\beta_i$
to $\rho$ and is the rank of the string in $\mathcal{R}_i$ which has $x_{i+1}$ as its last
symbol. The rank $R(x_{i+1})$ is used to convert $x_{i+1}$ into
a small integer to improve the compression ratio. The reason
is that a symbol $c \in (\mathcal{X} \setminus \mathcal{L}_i(\beta_i))$ having high probability will
be found at a node near $\beta_i$ on the suffix links. The details are
described in [7].
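
The ordering defined by (4), (5) and (6) amounts to a three-level sort key. The following sketch shows one way to express it; the input format (a list of pairs of a candidate string and its counter value) and the function name are assumptions made for this illustration, and the sketch does not perform the suffix-link traversal itself.

```python
def rank_symbols(candidates):
    """Rank the last symbols of the candidate strings.

    candidates: list of (string, counter) pairs, one per symbol, where
    string plays the role of w(p)c and counter the role of N(c|p).
    Longer strings rank first (4), then larger counters (5),
    then the lexicographically smaller symbol (6).
    """
    ordered = sorted(
        candidates,
        key=lambda item: (-len(item[0]), -item[1], item[0][-1]),
    )
    return {
        string[-1]: rank
        for rank, (string, _) in enumerate(ordered, start=1)
    }


if __name__ == "__main__":
    print(rank_symbols([("012", 3), ("1", 5), ("003", 3), ("24", 9)]))
```

In this example the two strings of length three tie on length and counter, so the symbol 2 precedes the symbol 3; the string of length two comes next despite its larger counter, and the single symbol 1 is ranked last.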

In Case-(1), no symbol is output since $x_{i+1}$ is predictable
from the fact that there exists only one edge from $\beta_i$. In
Case-(2), the probability $\Pr(x_{i+1} \mid \beta_i)$ is calculated by
$N(x_{i+1} \mid \beta_i) / \sum_{c \in \mathcal{X}} N(c \mid \beta_i)$, where $N(c \mid \beta_i)$ is a counter that holds the
number of times the edge labeled $c$ has been traversed from the internal node $\beta_j$
($0 \le j \le i - 1$). Note that for an internal node
$n_k$ of $T_i$ such that $|\mathcal{L}_i(n_k)| \ge 2$, if $c \in \mathcal{L}_i(n_k)$, then the initial
value of $N(c \mid n_k)$ is set to 1. Otherwise its initial value is 0.

Let $L_n^{\mathrm{s}}$ be the codeword length per symbol of the static
DCA algorithm for a random string of length $n$. That is,
$L_n^{\mathrm{s}}$ is given by (the codeword length) / $n$. Then, the following
theorem holds.

Theorem B (Theorem 7 [1]): Under Assumption 1, for a
balanced binary source $\boldsymbol{X}_{\mathcal{A}}$, $L_n^{\mathrm{s}}$ converges to $H(\boldsymbol{X}_{\mathcal{A}})$ with
probability one as $n \to \infty$.

IV. Main Results 

If $\boldsymbol{X}_{\mathcal{A}}$ is stationary ergodic, then we obtain the following
theorem for the static DCA algorithm.

Theorem 1: Under Assumption 1, for a stationary ergodic
source $\boldsymbol{X}_{\mathcal{A}}$, $L_n^{\mathrm{s}}$ converges to $H(\boldsymbol{X}_{\mathcal{A}})$ with probability one as
$n \to \infty$.

Now, let $L_n^{\mathrm{d}}$ be the codeword length per symbol of the
dynamic DCA algorithm for a random string of length $n$, and
let $m$ be the length of the longest MFW in $\mathcal{A}$. Moreover,
we make the following assumption on the dynamic DCA
algorithm.

Assumption 2: Neither the encoder nor the decoder of the dynamic
DCA algorithm knows $\mathcal{A}$, while both know $m$.

Theorem 2: Under Assumption 2, for a stationary ergodic
source $\boldsymbol{X}_{\mathcal{A}}$, $L_n^{\mathrm{d}}$ converges to $H(\boldsymbol{X}_{\mathcal{A}})$ with probability one as
$n \to \infty$.

A. Proof of Theorem 1

We use three lemmas to prove Theorem 1. Let $S_{2,0}$ and
$S_{2,\infty}$ be the sets of states in $S_2$ for $\mu_i = 0$ and $\mu_i > 0$,
respectively. For $X^n$, let $Y_{i,n}$ be a random variable taking
as its value the number of times $s_i$ is traversed, and let $|e(X^n)|$
be a random variable taking as its value the length of the output of
Case-(2), that is $e(x)$ in (2). For a given symbol $c \in \mathcal{X}$ and
$s_i \in S_{2,\infty}$, let $Z_{ic,h}$ be the random variable, defined when $s_i$ is traversed
for the $h$th time, such that

$$Z_{ic,h} = \begin{cases} 1 & (z = c), \\ 0 & (z \ne c), \end{cases} \tag{7}$$

where $z$ is the label of the outgoing edge traversed from
$s_i$ at that time. For a positive integer $k$, $[Z_{ic}]_k$ is given by

$$[Z_{ic}]_k = (Z_{ic,1} + Z_{ic,2} + \cdots + Z_{ic,k}) / k.$$

Lemma 1: $\Pr\{\lim_{n \to \infty} Y_{i,n}/n = \mu_i\} = 1$.
Lemma 2: $\Pr\{\lim_{n \to \infty} [Z_{ic}]_{(Y_{i,n})} = p_{ic}\} = 1$.
Lemma 3: $\Pr\{\limsup_{n \to \infty} |e(X^n)|/n = H(\boldsymbol{X}_{\mathcal{A}})\} = 1$.

(Proof of Lemma 1): From the definition of $\boldsymbol{X}_{\mathcal{A}}$, the
steady state probability of $s_i$ is given by $\mu_i$. Therefore, the
lemma holds. ■

(Proof of Lemma 2): A sequence $Z = Z_{ic,1} Z_{ic,2} \cdots$
is i.i.d. and $Z_{ic,h}$ ($h = 1, 2, \ldots$) has the same probability
distribution. From the definition of $Z_{ic,h}$, the expected
value $E(Z_{ic,h})$ equals $p_{ic}$. Moreover, for $s_i \in S_{2,\infty}$, from
Lemma 1, $Y_{i,n}$ diverges to infinity as $n \to \infty$ with prob. 1.
Therefore, from the strong law of large numbers, the lemma
holds. ■
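(Proof of Lemma 3):

$$\limsup_{n \to \infty} \frac{|e(X^n)|}{n} = \limsup_{n \to \infty} -\frac{1}{n} \sum_{i : s_i \in S_2} \sum_{c=0}^{|\mathcal{X}|-1} Y_{i,n} [Z_{ic}]_{(Y_{i,n})} \log_2 [Z_{ic}]_{(Y_{i,n})} \tag{8}$$

$$\overset{(a)}{=} \sum_{i : s_i \in S_2} \sum_{c=0}^{|\mathcal{X}|-1} \limsup_{n \to \infty} -\frac{Y_{i,n}}{n} [Z_{ic}]_{(Y_{i,n})} \log_2 [Z_{ic}]_{(Y_{i,n})} \tag{9}$$

$$= \sum_{i : s_i \in S_{2,0}} \sum_{c=0}^{|\mathcal{X}|-1} \limsup_{n \to \infty} -\frac{Y_{i,n}}{n} [Z_{ic}]_{(Y_{i,n})} \log_2 [Z_{ic}]_{(Y_{i,n})} + \sum_{i : s_i \in S_{2,\infty}} \sum_{c=0}^{|\mathcal{X}|-1} \limsup_{n \to \infty} -\frac{Y_{i,n}}{n} [Z_{ic}]_{(Y_{i,n})} \log_2 [Z_{ic}]_{(Y_{i,n})} \tag{10}$$

$$\overset{(b)}{=} -\sum_{i : s_i \in S_{2,\infty}} \mu_i \sum_{c=0}^{|\mathcal{X}|-1} p_{ic} \log_2 p_{ic} \tag{11}$$

$$\overset{(c)}{=} H(\boldsymbol{X}_{\mathcal{A}}), \tag{12}$$

where (a) follows from the fact that the index $i$ of a state in
$G(\mathcal{A})$ is independent of $n$, (b) follows from
Lemma 1, Lemma 2, and the fact that the first term of the right-hand side of
(10) converges to 0 with prob. 1 as $n \to \infty$ since $\mu_i = 0$ for
any $s_i \in S_{2,0}$, and (c) follows from (1). ■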
(Proof of Theorem 1): From (2), $L_n^{\mathrm{s}}$ satisfies

$$\lim_{n \to \infty} L_n^{\mathrm{s}} \le \limsup_{n \to \infty} \frac{\#\mathcal{A} + |e(X^n)| + |\omega^*(n)|}{n}, \tag{13}$$

where $\#\mathcal{A}$ is the size of the list of all the MFWs in $\mathcal{A}$, and $\omega^*(n)$
is a representation of $n$ using the Elias $\omega^*$ code for positive
integers [17] (cf. [12]). The length $|\omega^*(n)|$ satisfies

$$|\omega^*(n)| \le \log_2 n + 2 \log_2(\log_2 n) + 7. \tag{14}$$

From (14), the third term of the right-hand side of (13)
converges to 0 as $n \to \infty$. From Assumption 1, $\#\mathcal{A}$ is a
constant, so that the first term also converges to 0 as $n \to \infty$.
Therefore, from Lemma 3, the theorem holds with prob. 1. ■
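
The role of the bound (14) in this argument can be checked numerically: the cost of describing $n$ grows only logarithmically, so its share per source symbol vanishes. The short sketch below merely evaluates the right-hand side of (14); it does not implement the $\omega^*$ code itself, and the function name is hypothetical. The bound is evaluated for $n \ge 2$ so that both logarithms are defined.

```python
from math import log2


def omega_star_bound(n):
    """Right-hand side of (14), in bits, for an integer n >= 2."""
    if n < 2:
        raise ValueError("the bound is evaluated for n >= 2")
    return log2(n) + 2 * log2(log2(n)) + 7


if __name__ == "__main__":
    for exponent in (4, 10, 20):
        n = 2 ** exponent
        # Bound on the length of the description of n, and its cost per symbol.
        print(n, omega_star_bound(n), omega_star_bound(n) / n)
```

For $n = 16$ the bound is 15 bits, and for $n = 2^{20}$ it stays below 36 bits, which is a negligible fraction of a bit per symbol.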

B. Proof of Theorem 2

We use eight lemmas to prove Theorem 2. For a given
fixed integer $m \ge 1$ in Assumption 2, we use $m - 1$ as the
depth $d$ in Definition 2, that is

$$d = m - 1. \tag{15}$$

We define a random variable $V_k \overset{\mathrm{def}}{=} X_k X_{k+1} \ldots X_{k+d} \in \mathcal{X}^{d+1}$
for $k \ge 1$. For $V_k$, a random variable $Q_k$ is defined as

$$Q_k = \begin{cases} 0 & (\exists i : V_k = v = V_i, \; (1 \le i \le k-1)), \\ 1 & (V_k = v \ne V_i, \; (1 \le i \le k-1)), \end{cases} \tag{16}$$

where $v$ is a string satisfying $\Pr\{V_1 = v\} > 0$. Note that
we define that $Q_1$ takes the value 1. For a string $x^n$ on $\boldsymbol{X}_{\mathcal{A}}$, let
$\Lambda_n$ be the set of all nodes whose depth is $d$ in $T_n$, and for
any state $s_j$ ($1 \le j \le |S|$) of $G(\mathcal{A})$, we define
$\Lambda_{j,n} = \{p \mid s_j = s(w(p)), p \in \Lambda_n\}$. Note that for a node $p \in \Lambda_n$, the
unique state of $G(\mathcal{A})$ is determined from Theorem A since
$|w(p)| = d$ and $d = m - 1$.

For a node $p$, let $N_n(p)$ be the random number of times
that $\beta_h$ passes $p$ ($0 \le h \le n - 1$). For a given symbol $c \in \mathcal{X}$ and
$p \in \Lambda_{j,n}$, let $\tilde{Z}_{jc,k}$ be the random variable, defined when $p$ is traversed
for the $k$th time, such that

$$\tilde{Z}_{jc,k} = \begin{cases} 1 & (z = c), \\ 0 & (z \ne c), \end{cases} \tag{17}$$

where $z$ is the label of the edge traversed from $p$ at
that time. For a positive integer $g$, $[\tilde{Z}_{jc}]_g$ is given by
$[\tilde{Z}_{jc}]_g = (\tilde{Z}_{jc,1} + \tilde{Z}_{jc,2} + \cdots + \tilde{Z}_{jc,g}) / g$. Let $D_n$ be a random variable
taking as its value the depth of $\beta_n$, that is $|w(\beta_n)|$, and let $E_n$ be a random
variable taking as its value the index of $s(w(\beta_n))$.

Lemma 4: If $x_{n-d+1} x_{n-d+2} \cdots x_n \in \mathcal{D}(x^{n-1})$, then
$|w(\beta_n)| = d$.

Lemma 5: If $\beta_n \in \Lambda_n$, then $s(w(\beta_n)) = s(x^n)$.

Lemma 6: $\Pr\{\lim_{n \to \infty} Q_n = 0\} = 1$.

Lemma 7: $\Pr\{\lim_{n \to \infty} D_n = d\} = 1$.

Lemma 8: $\Pr\{\lim_{n \to \infty} s_{E_n} = s(x^n)\} = 1$.

Lemma 9: $\Pr\{\lim_{n \to \infty} \sum_{p \in \Lambda_{j,n}} N_n(p)/n = \mu_j\} = 1$.

Lemma 10: For $p \in \Lambda_{j,n}$, $\Pr\{\lim_{n \to \infty} \mathcal{L}_n(p) = \mathcal{L}(s_j)\} = 1$.

Lemma 11: For $p \in \Lambda_{j,n}$, $\Pr\{\lim_{n \to \infty} [\tilde{Z}_{jc}]_{(N_n(p))} = p_{jc}\} = 1$.

(Proof of Lemma 4): Since $v = x_{n-d+1} x_{n-d+2} \ldots x_n \in \Sigma(x^n)$,
we have $v \in (\Sigma(x^n) \cap \mathcal{D}(x^{n-1}))$. From Definition 1,
we obtain $|w(\alpha_n)| \ge |v| = d$. Therefore, we have $|w(\beta_n)| = d$
from (3). ■

(Proof of Lemma 5): Since $\beta_n \in \Lambda_n$, we have $w(\beta_n) =
w = x_{n-d+1} x_{n-d+2} \ldots x_n$ and $|x^n| \ge |w| = d$. From
Theorem A and $s(w) = s(x^{n-d} w)$, we have $s(w(\beta_n)) = s(x^n)$. ■

(Proof of Lemma 6): Since $\boldsymbol{X}_{\mathcal{A}}$ is a stationary ergodic
source, the lemma holds (cf. [18]). ■

(Proof of Lemma 7): Since $d$ is a constant,
$\Pr\{\lim_{n \to \infty} Q_{n-d+1} = 0\} = 1$ from Lemma 6. Therefore,
there exists $j$ ($1 \le j < n - d + 1$) such that
$X_{n-d+1} X_{n-d+2} \ldots X_n = X_j X_{j+1} \ldots X_{j+d-1}$ with
probability 1. Hence, from Lemma 4, the lemma holds. ■

(Proof of Lemma 8): From Lemmas 5 and 7, the lemma
holds. ■

(Proof of Lemma 9): Suppose that $s(x^n) = s_j$. From
Lemma 5, if $\beta_n \in \Lambda_n$, then $\beta_n \in \Lambda_{j,n}$, that is $s(w(\beta_n)) = s_j$.
On the other hand, if $\beta_n \notin \Lambda_n$, that is $|w(\beta_n)| < d$, then
$s(w(\beta_n)) \ne s_j$ can hold.

We evaluate the maximum total number $M$ of occurrences
such that $|w(\beta_k)| < d$ for $0 \le k \le n - 1$. Let $w$ be the
suffix of $x^n$ of length $d$. If $w \notin \mathcal{D}(x^{n-1})$, then $|w(\beta_n)| < d$
from Definition 2. On the other hand, if all the strings whose
lengths are not more than $d$ are included in $\mathcal{D}(x^{n-1})$, then
$|w(\beta_n)| = d$. Therefore, $M$ is the total number of strings
in $\mathcal{X}^*$ whose lengths are not more than $d$, since $\mathcal{D}(x^{n-1})$ is
monotone increasing with respect to $n$. Hence, $M$ is given
by $(|\mathcal{X}|^{d+1} - 1)/(|\mathcal{X}| - 1)$ for $|\mathcal{X}| \ge 2$. In other words, it is
equal to the number of nodes of a tree, called a $|\mathcal{X}|$-ary tree,
such that any external node has depth $d$ and any internal node
has exactly $|\mathcal{X}|$ descendants (cf. [19]). Note that for $|\mathcal{X}| = 1$,
the total number is given by $d$. By using $M$, for any $\Lambda_{j,n}$, the
following relation holds.

$$\frac{Y_{j,n}}{n} - \frac{M}{n} \le \sum_{p \in \Lambda_{j,n}} \frac{N_n(p)}{n} \le \frac{Y_{j,n}}{n} + \frac{M}{n}. \tag{18}$$

Since $|\mathcal{X}|$ and $d$ are constants, $M$ is a constant. Hence, the term
$M/n$ converges to 0 as $n \to \infty$. Therefore, $Y_{j,n}/n$ converges
to $\mu_j$ as $n \to \infty$ from Lemma 1, so that the lemma holds. ■

(Proof of Lemma 10): Due to $p \in \Lambda_{j,n}$, we have
$s(w(p)) = s_j$. Hence, $\limsup_{n \to \infty} \mathcal{L}_n(p) = \mathcal{L}(s_j)$. Next, we
will show that $\Pr\{\liminf_{n \to \infty} \mathcal{L}_n(p) = \mathcal{L}(s_j)\} = 1$. For a
string $x^n$, $\mathcal{D}(x^n)$ is monotone increasing with respect to $n$,
so that we have $\mathcal{L}_n(p) \subseteq \mathcal{L}_{n+1}(p)$. Moreover, from Lemma 6,
for any $c \in \mathcal{L}(s_j)$, $w(p)c \in \mathcal{D}(x^{n-1})$ as $n \to \infty$ with prob. 1.
Therefore, $\Pr\{\liminf_{n \to \infty} \mathcal{L}_n(p) = \mathcal{L}(s_j)\} = 1$. Hence, the
lemma holds. ■

(Proof of Lemma 11): Due to $p \in \Lambda_{j,n}$, we have
$s(w(p)) = s_j$. Therefore, from Lemmas 6 and 10, $\tilde{Z}_{jc,k}$
has the same probability distribution as $Z_{jc,k}$, and $E(\tilde{Z}_{jc,k})$
equals $E(Z_{jc,k})$ ($k = 1, 2, \ldots$). Hence, $E(\tilde{Z}_{jc,k})$ equals
$p_{jc}$. Moreover, a sequence $\tilde{Z} = \tilde{Z}_{jc,1} \tilde{Z}_{jc,2} \cdots$ is i.i.d.
Since $\boldsymbol{X}_{\mathcal{A}}$ is supposed to be a stationary ergodic source and
$\Pr\{V_1 = w(p)\} > 0$, $N_n(p)$ diverges to infinity with prob. 1
as $n \to \infty$. Therefore, from the strong law of large numbers,
the lemma holds. ■
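
The count $M$ used in the proof of Lemma 9, the number of strings of length at most $d$ over an alphabet with at least two symbols, can be checked against direct enumeration. The sketch below is only a sanity check with hypothetical function names; it counts the empty string as the string of length zero.

```python
from itertools import product


def count_strings_up_to(alphabet, d):
    """Enumerate every string of length 0..d and count them."""
    return sum(1 for k in range(d + 1) for _ in product(alphabet, repeat=k))


def closed_form(size, d):
    """(size^(d+1) - 1) / (size - 1), valid for size >= 2."""
    return (size ** (d + 1) - 1) // (size - 1)


if __name__ == "__main__":
    print(count_strings_up_to("01", 4), closed_form(2, 4))
```

For the binary alphabet and $d = 4$, both the enumeration and the closed form give 31, the number of nodes of a complete binary tree of depth 4.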
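(Proof of Theorem 2): Let $C(X^n)$ be the codeword
length achieved by the dynamic DCA algorithm, and let
$C_0(X^n)$ and $C_2(X^n)$ be the codeword lengths in Case-(0) and
Case-(2), respectively, that is,

$$C(X^n) = C_0(X^n) + C_2(X^n). \tag{19}$$

Therefore, $L_n^{\mathrm{d}}$ is given by

$$\lim_{n \to \infty} L_n^{\mathrm{d}} = \lim_{n \to \infty} \frac{C(X^n)}{n}. \tag{20}$$

First, we evaluate $C_0(X^n)$. Let $n_0$ be the total number of
occurrences of Case-(0) for a given $x^n$ on $\boldsymbol{X}_{\mathcal{A}}$, and let $I_0$ and
$R_0$ be the maximum code lengths of $I$ and $R(x_{i+1})$ shown
in Table II for $0 \le i \le n - 1$. Suppose that $|\mathcal{X}| \ge 2$. The
value $n_0$ is not more than the total number of strings whose
length is not more than $d + 1$ in $\mathcal{X}^*$, since Case-(0) occurs if
$w(\beta_i) x_{i+1} \notin \mathcal{D}(x^i)$. Therefore, we obtain

$$n_0 \le (|\mathcal{X}|^{d+2} - 1)/(|\mathcal{X}| - 1). \tag{21}$$

Moreover, the maximum length of $I$ is $n$. Hence, by using the
Elias $\omega^*$ code, we obtain

$$I_0 \le |\omega^*(n)|. \tag{22}$$

By using a fixed length code for a symbol with respect to $R_0$, we obtain

$$R_0 \le \log_2 |\mathcal{X}|. \tag{23}$$

From (21), (22), and (23),

$$C_0(x^n)/n \le n_0 \cdot (I_0 + R_0)/n \tag{24}$$

$$\le \frac{(|\mathcal{X}|^{d+2} - 1) \cdot (|\omega^*(n)| + \log_2 |\mathcal{X}|)}{(|\mathcal{X}| - 1)\, n}. \tag{25}$$

Since $|\mathcal{X}|$ and $d$ are constants, from (14), $C_0(x^n)/n$ converges
to 0 as $n \to \infty$. Therefore,

$$\lim_{n \to \infty} C_0(x^n)/n = 0. \tag{26}$$

Note that in the case of $|\mathcal{X}| = 1$, since $n_0 \le d + 1$ and $R_0 = 1$,
equation (26) holds.

Next, we evaluate $C_2(X^n)$. For a given $x^n$, let $l(p)$ be the
average code length of Case-(2) for a node $p$ in $T_n$. Note
that $l(p) < \infty$ since $|\mathcal{X}|$ is finite. We have

$$\lim_{n \to \infty} \frac{C_2(x^n)}{n} \le \limsup_{n \to \infty} \frac{1}{n} \sum_{|\mathcal{L}_n(p)| \ge 2} N_n(p)\, l(p) \tag{27}$$

$$\le \limsup_{n \to \infty} \frac{1}{n} \sum_{|\mathcal{L}_n(p)| \ge 2,\, p \notin \Lambda_n} N_n(p)\, l(p) + \limsup_{n \to \infty} \frac{1}{n} \sum_{|\mathcal{L}_n(p)| \ge 2,\, p \in \Lambda_n} N_n(p)\, l(p). \tag{28}$$

For $p \notin \Lambda_n$, the maximum value of $N_n(p)$ is less than or
equal to the total number $M$ of strings whose lengths are
not more than $d$ in $\mathcal{X}^*$, that is, the $M$ described in the proof of
Lemma 9. Therefore, the first term in the right-hand side of
(28) converges to 0 as $n \to \infty$ since $M$ is a constant. Let $\varepsilon_n$
be the first term. From (28), we obtain

$$\lim_{n \to \infty} \frac{C_2(x^n)}{n} \le \varepsilon_n + \limsup_{n \to \infty} \frac{1}{n} \sum_{|\mathcal{L}_n(p)| \ge 2,\, p \in \Lambda_n} N_n(p)\, l(p) \tag{29}$$

$$\overset{(a)}{=} \varepsilon_n + \sum_{j : s_j \in S_{2,\infty}} \limsup_{n \to \infty} \frac{1}{n} \sum_{p \in \Lambda_{j,n}} N_n(p)\, l(p). \tag{30}$$

Moreover, for $p \in \Lambda_{j,n}$, from Lemma 11 and (31),

$$\lim_{n \to \infty} l(p) = -\sum_{c=0}^{|\mathcal{X}|-1} p_{jc} \log_2 p_{jc} \tag{32}$$

with prob. 1 as $n \to \infty$. From (30), (32), and Lemma 9,

$$\lim_{n \to \infty} \frac{C_2(x^n)}{n} \le -\sum_{j : s_j \in S_{2,\infty}} \mu_j \sum_{c=0}^{|\mathcal{X}|-1} p_{jc} \log_2 p_{jc} \tag{33}$$

with prob. 1. From (33) and (1),

$$\lim_{n \to \infty} \frac{C_2(x^n)}{n} \le H(\boldsymbol{X}_{\mathcal{A}}) \tag{34}$$

with prob. 1. From (19), (20), (26), (34), and since $\varepsilon_n$
converges to 0 with prob. 1 as $n \to \infty$, we obtain

$$L_n^{\mathrm{d}} \to H(\boldsymbol{X}_{\mathcal{A}}) \tag{35}$$

with prob. 1 as $n \to \infty$. Therefore, the theorem holds. ■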

V. Conclusion 

In this paper, we proved the asymptotic optimality of both static
and dynamic DCA algorithms with respect to antidictionary
sources, that is, stationary ergodic Markov sources driven by
$G(\mathcal{A})$. The average code length per symbol of each algorithm
converges to the entropy rate of the source with probability one.